Conditions on Consistency of Probabilistic Tree Adjoining Grammars
Abstract
Much of the power of probabilistic methods in modelling language comes from their ability to compare several derivations for the same string in the language. An important starting point for the study of such cross-derivational properties is the notion of consistency. The probability model defined by a probabilistic grammar is said to be consistent if the probabilities assigned to all the strings in the language sum to one. From the literature on probabilistic context-free grammars (CFGs), we know precisely the conditions which ensure that consistency is true for a given CFG. This paper derives the conditions under which a given probabilistic Tree Adjoining Grammar (TAG) can be shown to be consistent. It gives a simple algorithm for checking consistency and gives the formal justification for its correctness. The conditions derived here can be used to ensure that probability models that use TAGs can be checked for deficiency (i.e. whether any probability mass is assigned to strings that cannot be generated).
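To make the notion of consistency concrete, the following is a minimal sketch for the simpler context-free case. It is not the algorithm from the paper and does not handle TAGs. It assumes a hypothetical toy grammar, S -> S S with probability p and S -> a with probability 1 - p, and estimates by fixed-point iteration how much probability mass falls on finite derivations. The names `termination_mass` and `toy_grammar` are illustrative only. For this grammar each rewrite of S produces 2p copies of S on average, so the mass is one when p is at most 0.5 and drops to (1 - p) / p above that.

```python
from math import prod


def termination_mass(grammar, start, iterations=500):
    """Approximate the probability mass on finite derivations from `start`.

    `grammar` maps each nonterminal to a list of (probability, rhs) pairs,
    where rhs is a list of symbols; symbols that are not keys of `grammar`
    are treated as terminals. Iterating from zero converges to the least
    fixed point, which is the total probability of terminating derivations.
    """
    q = {nt: 0.0 for nt in grammar}
    for _ in range(iterations):
        q = {
            nt: sum(p * prod(q[s] for s in rhs if s in grammar) for p, rhs in rules)
            for nt, rules in grammar.items()
        }
    return q[start]


def toy_grammar(p):
    # Hypothetical example: S -> S S (probability p), S -> 'a' (probability 1 - p)
    return {"S": [(p, ["S", "S"]), (1.0 - p, ["a"])]}


consistent_mass = termination_mass(toy_grammar(0.4), "S")  # grammar is consistent
deficient_mass = termination_mass(toy_grammar(0.6), "S")   # mass leaks to infinite derivations
```

With p = 0.6 a third of the probability mass is lost to derivations that never terminate, which is the kind of deficiency a consistency check is meant to detect.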
Similar resources
State-Split for Hypergraphs with an Application to Tree Adjoining Grammars
In this work, we present a generalization of the state-split method to probabilistic hypergraphs. We show how to represent the derivational structure of probabilistic tree-adjoining grammars by hypergraphs and detail how the generalized state-split procedure can be applied to such representations, yielding a state-split procedure for tree-adjoining grammars.
PreRkTAG: Prediction of RNA Knotted Structures Using Tree Adjoining Grammars
Background: RNA molecules play many important regulatory, catalytic and structural ...
Developing a TT-MCTAG for German with an RCG-based Parser
Developing linguistic resources, in particular grammars, is known to be a complex task in itself, because of (amongst others) redundancy and consistency issues. Furthermore, some languages can prove hard to describe because of specific characteristics, e.g. the free word order in German. In this context, we present (i) a framework for describing tree-based grammars, and (ii) an...
Nonparametric Bayesian Inference and Efficient Parsing for Tree-adjoining Grammars
In the line of research extending statistical parsing to more expressive grammar formalisms, we demonstrate for the first time the use of tree-adjoining grammars (TAG). We present a Bayesian nonparametric model for estimating a probabilistic TAG from a parsed corpus, along with novel block sampling methods and approximation transformations for TAG that allow efficient parsing. Our work shows pe...
Multiple Context-Free Tree Grammars and Multi-component Tree Adjoining Grammars
Strong lexicalization is the process of turning a grammar generating trees into an equivalent one, in which all rules contain a terminal leaf. It is known that tree adjoining grammars cannot be strongly lexicalized, whereas the more powerful simple context-free tree grammars can. It is demonstrated that multiple simple context-free tree grammars are as expressive as multi-component tree adjoini...